Topographic analysis of pancreatic cancer by TMA and digital spatial profiling reveals biological complexity with potential therapeutic implications

Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal human malignancies. Tissue microarrays (TMA) are an established method of high throughput biomarker interrogation in tissues but may not capture histological features of cancer with potential biological relevance. Topographic TMAs (T-TMAs) representing pathophysiological hallmarks of cancer were constructed from representative, retrospective PDAC diagnostic material, including 72 individual core tissue samples. The T-TMA was interrogated with tissue hybridization-based experiments to confirm the accuracy of the topographic sampling, expression of pro-tumourigenic and immune mediators of cancer, totalling more than 750 individual biomarker analyses. A custom designed Next Generation Sequencing (NGS) panel and a spatial distribution-specific transcriptomic evaluation were also employed. The morphological choice of the pathophysiological hallmarks of cancer was confirmed by protein-specific expression. Quantitative analysis identified topography-specific patterns of expression in the IDO/TGF-β axis; with a heterogeneous relationship of inflammation and desmoplasia across hallmark areas and a general but variable protein and gene expression of c-MET. NGS results highlighted underlying genetic heterogeneity within samples, which may have a confounding influence on the expression of a particular biomarker. T-TMAs, integrated with quantitative biomarker digital scoring, are useful tools to identify hallmark specific expression of biomarkers in pancreatic cancer.


TMA construction and immunohistochemistry
Representative formalin-fixed, paraffin-embedded tissue samples were derived from 12 resected PDAC, accessed through the Northern Ireland Biobank 19 .H&E stained slides from each case was reviewed and donor blocks selected which contained the pathophysiological hallmarks outlined below.Sections were digitally scanned on an Aperio AT2 at × 40 magnification and scans transferred to a fully automated tissue microarrayer (TMA Grandmaster, 3DHISTECH (Budapest)).H&Es were digitally annotated by consultant clinical diagnosticians (MST and SMcQ) to select the distinct regions depicting the seven pathophysiological hallmarks described.Briefly, H&E image were presented on screen within the 3DHITECH CaseViewer software.TMA core annotations were digitally placed on areas of interest on the H&E images.Area selection was confirmed by consensus.From each of the 12 cases, up to 7 selected 1.0 mm cores were placed into a recipient TMA block and sections prepared for analysis, representing a total of 72 tissue cores.
To establish the accuracy of the TMA-targeted model to capture focal pathophysiological hallmarks of cancer, we compared the intended purpose of each TMA core against the histological or immunohistochemical evidence by direct analysis by two experts (MST & SMcQ).T-cell density (CD3) in individual cores and the relationship between desmoplasia (SMA) were quantified using HALO image analysis v2.2.IDO-1 and TGF-β were assessed microscopically.Image analysis quantification was conducted on c-Met protein with Definiens Architect v2.7 and for mRNA expression using HALO v2.2

Next generation sequencing
H&Es from representative tumour and normal blocks from each case was annotated to enrich for tumour or normal epithelial content.4 × 10 µm thick sections from each block were macrodissected and DNA extracted with a Promega Maxwell 16 FFPE tissue LEV DNA extraction kit on a Maxwell 16 instrument.DNA was quantified via Qubit (ThermoFisher) and QC performed using the Agilent 4200 Tapestation.NGS libraries were prepared and hybridisation-based target enrichment performed using the Roche HyperCap workflow and associated kits.For determination of mutational status a custom designed Next Generation Sequencing (NGS) panel (Roche SeqCap EZ) to detect single nucleotide variants (SNVs) in genes was employed, including BRAF, CDKN2A, KRAS and TP53 which are frequently mutated in solid tumours (see supplementary table S1 for full panel details).This panel and kit were validated in-house and the final library was sequenced on a NextSeq 500 instrument using a Mid Output v2.5 (150 cycles) cartridge.The sequenced samples were then demulitplexed using Illumina's bcl2fastq (v2.20.0.422).The generated fastq files were then aligned to human reference genome (v GRCh38/hg38) using Burrow-Wheeler Aligner (bwa, v0.7.17) 21following which Samtools (v4.0.12.0) 22 was used for sorting, merging and filtering the bam files generated by bwa.GATK4's (v 4.0.12) 23picard was used to sort, mark and remove the duplicated reads.GATK4 was also used for local realignment of reads around INDELs and base recalibration.Single nucleotide variants (SNVs) were detected against matched normal samples using GATK4 Mutect2 24 .GATK4 FilterMutectCalls was used for filtering the called variants and SNPEff 25 was used to annotate the final filtered calls.
www.nature.com/scientificreports/Nanostring digital spatial profiling 3 × 4 µm sections were cut from the PDAC TMA block onto the centre of positively charged Superfrost Plus Micro slides.Sections were air-dried at room temperature overnight and shipped with dessicant to Nanostring (Nanostring Technologies, 530 Fairview Avenue Norh, Seattle, W 98109, USA).Nanostring technologies then performed Nanostring Digital Spatial Profiling (DSP) via their Technology Access programme (TAP).This procedure allows the accurate selection of regions of tissue based on protein stains and subsequent RNA extraction and sequencing of cells from those regions.Two different protein stains pan-cytokeratin (panCK) and the perineural marker S100β were used to drive the selection of Area of Illumination (AOIs) within and across the cores of our TMA.This allowed us to determine three different regions: (1) regions positive for panCK to highlight tumour; (2) regions positive for S100β to indicate perineural involvement adjacent to tumour; (3) other stromal regions negative for both panCK and S100β, according to established methodology 26 .Initial quality control of the RNA hybridization data demonstrated the experiment was successful and that RNA was well preserved in all TMA cores.14,000 genes were above the Limits of Quantitation and available for analysis.

Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.Corresponding tumour slides and blocks were retrieved and collated via the Northern Ireland Biobank.In accordance with the Human Tissue Act 2004, consent is not required for use of archived, de-identified tissue in research studies with ethical approval 42 .

Topographic analysis
Figure 1A describes the design of the pancreatic TMA.All the proposed topographic areas were arrayed in 12 cases, with 2/12 displaying additional definite regions of intravascular tumour as shown (Fig. 1Avii 4 and 9). Figure 1B illustrates a representative area rich in: tumour (i), desmoplastic reaction (ii), peri-neural invasion (iii), peri-vascular extension (iv), tumour margin (epithelial-mesenchymal transition) (v), lymphocytic inflammation (vi), and intravascular invasion when obvious (vii).The transfer of each pathophysiological region of interest to the TMA was highly accurate, as confirmed by visual pathological identification of the histological structures or processes and by immunohistochemical detection.
Figure 1C demonstrates the wide heterogeneity of immune reaction (CD3) (i) and desmoplasia (SMA) (ii) in each topographical area as measured quantitatively using HALO image analysis v2.2.These data are further represented at the individual case level in supplementary figs.S1 and S2, respectively.Figure 1D shows the range of T-cell density (CD3) (i), desmoplastic reaction (SMA) (ii) and IDO1 expression (iii) in the different topographical areas.Complete data for all topographical areas was available in cases, 3-7 and 10-12 for CD3, in cases 6, 7 and 10 for SMA and in cases 1-2 and 5-11 for IDO1.CD3 expression varied to a greater degree across the cases while SMA displayed a more consistent positivity quantitatively.
Analysis also provided early evidence of two interesting observations about IDO-1 and TGF-β mRNA.Focal expression of IDO-1 on tumour epithelial cells was seen in 7/12 patients but would only have been 3/12 cases if only tumour rich cores had been sampled.In case 10 IDO1 expression was only seen on a small number of tumour epithelial cells at the tumour margin and in case 11 IDO1 expression was only observed on tumour cells in TMA cores taken from peri-neural and lymphocytic rich regions (Fig. 1D iii).
In only 2/12 cases was expression of TGF-β mRNA observed and only on 2 of the 7 TMA cores in each case.Expression was of low or moderate intensity when compared with hybridisation-specific house-keeping genes (Fig. 1E).Such low level expression of TGF-β mRNA would not have been observed if only epithelial rich tumour regions had been sampled.We also observe that the only cases positive for TGF-β were indeed the cases with the highest desmoplastic reaction (Table 1).

Analysis of c-Met
c-Met expression, evaluated by immunohistochemistry and RNA-ISH, demonstrated positivity in the tumour epithelium of all patients (12/12).The quantitative analysis of c-Met expression showed variability by both techniques in the patient cohort.Figure 2A(i) illustrates the overall significant positive correlation between IHC compared with RNAScope (Spearman r = 0.551 p < 0.0001).However in some cases a high level of discordance across the two techniques was observed, examples of staining are shown in Fig. 2A(ii).Additionally, the expression of c-Met IHC, quantified as percentage positivity in tumour above the detection threshold determined by image analysis, in the different topographic areas of each patient was heterogeneous (Fig. 2B(i)).Likewise, the expression of c-Met RNA-ISH, quantified by probe copy number µm 2 in tumour, also showed wide-ranging expression levels across the cases (Fig. 2B(ii)).Both techniques showed expression that predominated in the tumour rich area, followed by tumour margin, desmoplastic reaction, peri-vascular extension and peri-neural invasion, with a slightly lower expression in the lymphocytic inflammatory area.Complete data for all topographical areas was available in cases 3, 5-7 and 10-12 for c-Met IHC and case 5 for c-Met RNA-ISH.

Mutational analysis
NGS analysis showed the presence of somatic variants in 11 of 12 PDAC cases (Table 1).The most frequent mutations were found in KRAS, TP53 and CDKN2A.KRAS mutations were found in 10/12 of cases, TP53 in 9/12 of cases and CDKN2A in 3/12 of cases; all of these were consistent with the known mutational profile of PDAC.Other somatic allelic variants were found only once and in different cases.While the numbers in this study are small to establish firm immuno-genomic correlations, interesting patterns are starting to emerge.The case with highest mutational burden (case 12), also showed high expression levels of IDO-1 (IHC) and c-Met (IHC and RNA-ISH).The top two highest scoring cases by IHC, cases 9 and 11, that also show high RNA scores, both show the same mutation pattern, having mutations in KRAS, TP53 and CDKN2A.The one case that carried a mutation in the mismatch repair gene PMS2 (case 3), had the highest expression of CD3-positive T-cells.Finally, case 1, containing no detectable mutations, contained a low tumour cellularity (20%) had the lowest desmoplastic reaction, as measured by SMA, while case 12, which had the highest number of mutations, also had the most marked desmoplastic reaction.In Case 3, which had the lowest tumour content (10-15%), a PMS2 mutation was found suggesting that the lack of KRAS mutation in this sample is a genuine finding.The mean target coverage in tumours for the genes on the panel was between 95 and 745, with the lowest being in Case 4, in which two mutations were detected.Cases 1 and 3, in which no KRAS mutations were detected, had a mean target coverage in tumours of 746 and 461, respectively which indicates that these tumours are truly wild-type and that there are no sampling or technical issues resulting in false negatives.
Mutational heterogeneity across the cases was clearly apparent.7/12 cases displayed unique mutations.Only 3/12 cases displayed exactly the same mutations in KRAS, TP53, with the remaining two cases showing identical gene expression (KRAS, TP53, CDKN2A).

Nanostring digital spatial profiling
Principal Component Analysis (PCA) of the RNA data showed that samples clustered primarily on their panCK status, with panCK+ samples being clearly separate from the combined samples (Fig. 3A).These data indicate that samples from the same case show higher similarity than those from other individuals, as no clustering was observed based on topography (PCA not shown).www.nature.com/scientificreports/Unsupervised hierarchical clustering of the 1000 most variable genes across all datasets clustered together based on the IHC status (Fig. 3B).The heatmap reveals clear differentially expressed genes clustering across sample subtypes.12 of the 14 panCK+ samples show similar expression patterns, whereas two (both from case 5) show distinct differences and cluster at the edge of the panCK+ samples (Fig. 3B, green box).Multiple samples from case 5 show upregulation of a subset of genes not seen in the other cases, with the S100β+ sample from case 5 showing distinct differences from other S100β+ samples (Fig. 3B, blue box), and likewise seen in panCK-ve/ S100β− samples (Fig. 3B, red box).
Case 5 evidences precisely the range of heterogeneity that may be present within different topographical cores from the same case.panCK+ regions from three different topographic cores from case number 5 showed a clear split into a pair of similar expression patterns (perivascular extension and perineural invasion), and one (tumour margin) that showed a markedly different expression pattern (Fig. 3C).When these samples are seen in the context of all samples tested, the tumour margin sample is more similar to the panCK+ expression patterns seen in other cases and topographical cores, with the perivascular extension and perineural invasion samples being situated on a different branch of the cluster tree (Fig. 3B).
K-means clustering of gene expression patterns across case 5 split in to six clear gene ontology biological processes (Table 2).Unsurprisingly, perivascular extension and perineural invasion samples showed higher expression of genes involved in complement cascade, cell junction organisation, post-translational protein phosphorylation, interleukin-10 signalling, platelet degranulation and chemokine-chemokine receptor pathways (Table 2 Cluster C).Interestingly, these samples also showed higher expression of genes involved in pancreatic secretion, protein and fat digestion and absorption (Table 2 Cluster E).
In addition to the heterogeneity across topographical cores from the same case there is also evidence of heterogeneity within individual cores, as shown in the peri-vascular extension core of case 12. Two distinct but matching AOIs were interrogated within this core, each sample covering panCK+, panCK-ve/S100β− and S100β+.Whilst clustering shows that the IHC subsets do cluster more closely than samples from the individual regions of the core (Fig. 3D), the correlation in gene expression between the IHC subtypes is lower than would have been expected.Particularly for the S100β+ samples (r = 0.483), indicating that although the two S100β+ sites sampled are within the same TMA core, they show a sizeable degree of heterogeneity just a few micrometers from one another.

Discussion
The nature and dynamics of PDAC immune composition is largely unknown but is presumed to be influenced by factors such as desmoplasia 27 .Therefore, it would appear that a detailed analysis of biomarkers related to specific pathophysiological hallmarks may be essential to understand biomarker heterogeneity and complexity.www.nature.com/scientificreports/Herein, we introduce the concept of Topographic TMAs as a technical platform to interrogate the heterogeneity of cancer pathophysiology, towards the characterisation of a genuine tumour microenvironment.We demonstrate that such a design is feasible and can be verified with a small set of well characterised and targeted immunohistochemical biomarkers.Other studies that have utilised TMAs in the context of PDAC have adopted a more limited approach whereby cores are only described as derived from tumour or from the invasive margin but without detailed information on histological selection [28][29][30] .Using expert pathologist annotations (the current gold standard ground truth), we have captured specific pathophysiological hallmarks for each case.The morphological heterogeneity of PDAC is well established in the literature, and is core to the traditional description of the disease in which most of the topographic changes are well-described 31 .Our work set out to (1) capture this morphological heterogeneity in morphology-driven TMA construction; (2) confirm this heterogeneity with markers associated with this morphology; (3) once confirmed in (1) and ( 2), describe the potential heterogeneity of further biomarkers that may have therapeutic implications.Accepting the heterogeneous nature of PDAC, we demonstrate that our T-TMA model is able to capture such heterogeneity.Our anticipated immunohistochemical detection of biomarkers evidences that our approach does indeed capture morphological heterogeneity.
In this proof of concept study we did not detect a relationship between T-cell infiltration and desmoplasia using CD3 and SMA as surrogates.The expression of TGF-β mRNA in 2 of the cases is interesting, as increased TGF-β in the tumour microenvironment represents a primary mechanism of immune evasion that promotes T-cell exclusion and blocks acquisition of the TH1-effector phenotype 30 .Immunotherapies directed against TGF-β signalling may therefore have broad applications in treating patients with advanced cancer 32 .Specifically, in early stage PDAC patients TGF-β1 overexpression has been reported to be associated with improved survival and low tumour cell proliferation 2 .We believe this is the first report to date to demonstrate TGF-β expression in PDAC by RNA-ISH technology may have a highly focal pathophysiological distribution.
IDO-1 expression was observed on tumour cells in 7/12 cases but would only have been detected in 3/12 cases if the TMA had only contained tumour rich cores.Studies on the expression of IDO-1 in PDAC are relevant as the immunosuppressive IDO pathway has been shown to be impeded in a syngeneic mouse model of PDAC using a nanocarrier approach 33 .Quantitative c-Met expression was variable across cases and between the different topographical areas within each case, reinforcing the benefit of a topographic TMA approach to obtain a holistic perception.Pre-clinical studies suggest that the inhibition of the HGF/c-MET pathway may sensitise PDAC tumours to gemcitabine 6 .Combined approaches in orthotopic models result in a decrease in primary tumour volume and a significant reduction of metastatic tumour burden.This may represent a novel therapeutic strategy to improve outcomes in PDAC.
Mutational analysis showed that the three most frequent somatic allelic variants were in KRAS, TP53 and CDKN2A.These findings, albeit in a small sample cohort, are consistent with previous genomic characterizations in PDAC 33,34 .The unique distribution of mutated genes in our panel across the cases demonstrates that, even following meticulous topographical histological selection, the underlying gene profile can have a confounding influence on the expression of the particular biomarker being investigated.We are aware of the limitations of our analysis in that this panel contains on a small portion of coding regions of the human genome and only identifies a number of key driver cancer genes.Furthermore, though our study is in line with previous genomic observations, we must resist over interpretation and generalisations within our small sample feasibility study.
Digital spatial profiling revealed gene clustering was strongly correlated based on the IHC status across all samples, suggesting tumour margins (panCK+) may be comparable across cases.However, intra-case and intracore topographic sampling revealed broad heterogeneity with regard to biological processes, which may influence the local PDAC microenvironment.Our limited sample number precludes a robust comparison with other published PDAC gene expression profiles.However, the reliability of our analysis is buoyed by the correlation of our expression patterns in the panCK+ AOIs with that of two differential gene expression (DGE) studies on PDAC and normal tissue (data not shown) 35,36 .Direct comparison with these DGE studies is difficult as we do not compare tumour with normal samples, which may influence the expression data we see.Such a comparison would be worthy of investigation as clearly separate cell populations, such as that seen in our study, may provide a true representation of the local topographic PDAC microenvironment in comparison to studies which include both tumour and surrounding normal tissue, however carefully microdissected.
We are confident topographic TMAs can support a more holistic characterisation of genuine tumour microenvironments across many solid tumours.It is worth noting that the value of a particular representative areas may vary depending on the tissue type being evaluated.Within PDAC specifically, it could be argued that the heterogeneity associated with the desmoplastic reaction may be the most important in terms of representative area, per our described observations.Whether this representative area is the most definitive in larger cohorts remains to be seen.Cases with highest desmoplastic reaction presented the highest number of mutations, coupled with positive TGF-β expression, which itself presented with non-tumour pathophysiological distribution that may have immunotherapy treatment implications in PDAC.
The value of topographic TMAs as evidenced within our manuscript must also be caveated by the labour intensive nature of the creation of such a resource.Our approach to topographic TMAs creation was to utilise pathologist annotations, the clinical gold standard 37 .Yet, the diagnostic burden and reduction in pathologist workforce only compounds the difficulty faced when implementing our approach.However, the capabilities of deep learning offer opportunities that can be leveraged here via the algorithmic identification of representative areas to support the pathologist.The identification of varied tissue structures in PDAC H&E images by deep learning models has been proposed to support the routine diagnostic workflow 38 .Utilising proposed tissue classes presented by heat map on H&Es could accelerate representative area selection.Additionally, combining such mark-up images with the capabilities of high throughput TMA generation equipment, which supports the remote reviewing and selection of cores by an annotator, could significantly reduce the burden on the pathologist and support the implementation of this approach on larger scale TMA cohort construction.
The observations made in this study are preliminary and should be considered proof-of-principle.However, our results demonstrate the viability and strength of Topographic TMAs in PDAC, which comprehensively captures the extensive biomarker heterogeneity of pathophysiological regions scored digitally.This approach enriches the perception of biomarkers, with significant advantages to traditional TMAs (lacking complex representation) and full sections (where the histological hallmarks may not be present/obvious and the overall scoring can thus be misleading) 39 .That being said, TMAs have been used extensively to successfully interrogate many aspects of the tumour microenvironment for patient prognostication, by ourselves and others 16,40,41 .
In essence, in an era in which cancer heterogeneity, tissue microenvironment and tumour evolution dictates the complexity of cancer, Topographic TMAs may represent a conditio sine qua non for accurate biomarker evaluation.

Figure 1 .
Figure 1.(A) H&E of TMA design from 12 PDAC blocks from which the described regions were selectively cored into a single recipient block.(i) Tumour rich, (ii) desmoplastic reaction, (iii) peri-neural invasion, (iv) peri-vascular extension, (v) tumour margin (epithelial-mesenchymal transition), (vi) lymphocytic inflammation and (vii) intravascular invasion, when obvious.(B) H&E and relevant immune markers to describe the accuracy of the TMA annotations (i-vii, as described above).(C) Line graph displaying the range of CD3 (i) and SMA (ii) in each of the topographical areas for cases 1-12.(D)(i) bar chart displaying the cumulative CD3 percentage positivity for cases 1-12 encompassing the topographic areas where data was available, with SMA data shown in (ii).(D)(iii) Expression of IDO1 in the tumour and stromal compartments, in different topographic areas for cases 1-12, T, tumour; S, stroma.1E, focal IDO-1 IHC expression in a TMA core at × 4 and × 20 and TGF-β mRNA expression by RNA-ISH shown at × 20 and × 40.

Figure 2 .
Figure 2. (A)(i) Scatter plot displaying the correlation between IHC and RNA-ISH for C-Met expression across all cores where both data were available.(ii) Representative images of concordance and discordance of c-Met staining in the two hybridisation methodologies.(B)(i) bar chart displaying the cumulative c-Met IHC percentage positivity for cases 1-12 encompassing the topographic areas where data was available, with c-Met RNAScope data shown in (ii).

Figure 3 .
Figure 3. (A) Principle Component Analysis of the RNA data showing clustering.panCK status outlined in green.(B) Unsupervised hierarchical clustering of the 1000 most variable genes across all datasets clustered together based on the IHC status.Green box shows distinct panCK differences from two samples in case 5.The blue box highlights differences in the S100β+ sample from case 5, while the red box shows the difference panCK-ve/S100β− samples.(C) Displays the difference in heterogeneity across topograpahy within the same case (case 5), across perivascular extension, perineural invasion, and tumour margin.(D) Heat map showing the range of heterogeneity within the same core, where two samples were taken across the same topographical areas.
Samples in this study (NIB Study Ref: NIB15-0168) were provided by the Northern Ireland Biobank.The Northern Ireland Biobank is a HTA Licenced Research Tissue Bank with generic ethical approval from The Office of Research Ethics Committee Northern Ireland (ORECNI REF 21/NI/0019) and can confer ethical approval for projects which have received human tissue from the bank.

Table 1 .
Comparison of specific tumour mutations with the IHC or RNA-ISH score of biomarkers known to influence tumorigenesis.MT: Mutant.WT: Wild Type.

Table 2 .
K-means clustering of gene expression patterns displayed by gene ontology biological process.